Recommended from our members
New ideas and emerging research: evaluating prediction system accuracy
BACKGROUND: Prediction e.g. of project cost is an important concern in software engineering. PROBLEM: Although many empirical validations of software engineering prediction systems have been published, no one approach dominates and sense-making of conflicting empirical results is proving challenging. METHOD: We propose a new approach to evaluating competing prediction systems based upon an unbiased statistic (Standardised Accuracy), analysis of results relative to the baseline technique of guessing and calculation of effect sizes. RESULTS: Two empirical studies are revisited and the published results are shown to be misleading when re-analysed using our new approach. CONCLUSION: Biased statistics such as MMRE are deprecated. By contrast our approach leads to valid results. Such steps will greatly assist in performing future meta-analyses
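One commonly cited reason for treating MMRE as biased is that it penalises over-estimates more heavily than under-estimates of the same proportion. The sketch below illustrates that asymmetry with made-up effort values; the function names and the data are assumptions for illustration and are not taken from the paper.

```python
def mre(actual, predicted):
    """Magnitude of relative error for a single case."""
    return abs(actual - predicted) / actual


def mmre(actuals, predictions):
    """Mean magnitude of relative error over all cases."""
    return sum(mre(a, p) for a, p in zip(actuals, predictions)) / len(actuals)


# Hypothetical project efforts (e.g. person-hours); illustrative values only.
actuals = [100.0, 100.0, 100.0]
under = [a / 2 for a in actuals]  # every estimate is half the true value
over = [a * 2 for a in actuals]   # every estimate is double the true value

mmre_under = mmre(actuals, under)
mmre_over = mmre(actuals, over)
print(mmre_under, mmre_over)
```

Both sets of predictions are wrong by a factor of two, yet the under-estimates receive half the MMRE of the over-estimates, so a technique that systematically under-estimates can appear more accurate than it is.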
Group project work from the outset: an in-depth teaching experience report
This article is an extended version of a paper that was submitted to the 24th IEEE Conference on Software Engineering Education and Training, Honolulu, May 2011.
CONTEXT - we redesigned our undergraduate computing programmes to address problems of motivation and outdated content.
METHOD - the primary vehicle for the new curriculum was the group project which formed a central spine for the entire degree right from the first year.
RESULTS - so far this programme has been successfully run once. Failures, drop outs and students required to retake modules have more than halved (from an average of 21.6% over the previous 4 years to 9.5%) and students obtaining the top two grades have increased from 25.2% to 38.9%.
CONCLUSIONS - whilst we cannot be certain that all improvement is due to the group projects, informally the change has been well received. However, we are looking for areas to improve, including the possibility of more structured support for student metacognitive awareness
The scientific basis for prediction research
Copyright @ 2012 ACM
In recent years there has been a huge growth in using statistical and machine learning methods to find useful prediction systems for software engineers. Of particular interest is predicting project effort and duration and defect behaviour. Unfortunately, though results are often promising, no single technique dominates and there are clearly complex interactions between technique, training methods and the problem domain. Since we lack deep theory our research is of necessity experimental. Minimally, as scientists, we need reproducible studies. We also need comparable studies. I will show through a meta-analysis of many primary studies that we are not presently in that situation and so the scientific basis for our collective research remains in doubt. By way of remedy I will argue that we need to address these issues of reporting protocols and expertise plus ensure blind analysis is routine
An empirical investigation of an object-oriented software system
This is the post print version of the article. The official published version can be obtained from the link below.
This paper describes an empirical investigation into an industrial object-oriented (OO) system comprising 133,000 lines of C++. The system was a subsystem of a telecommunications product and was developed using the Shlaer-Mellor method. From this study, we found that there was little use of OO constructs such as inheritance and, therefore, polymorphism. It was also found that there was a significant difference in the defect densities between those classes that participated in inheritance structures and those that did not, with the former being approximately three times more defect-prone. We were able to construct useful prediction systems for size and number of defects based upon simple counts such as the number of states and events per class. Although these prediction systems are only likely to have local significance, there is a more general principle that software developers can consider building their own local prediction systems. Moreover, we believe this is possible, even in the absence of the suites of metrics that have been advocated by researchers into OO technology. As a consequence, measurement technology may be accessible to a wider group of potential users
A systematic review of software development cost estimation studies
This paper aims to provide a basis for the improvement of software estimation research through a systematic review of previous work. The review identifies 304 software cost estimation papers in 76 journals and classifies the papers according to research topic, estimation approach, research approach, study context and data set. A web-based library of these cost estimation papers is provided to ease the identification of relevant estimation research results. The review results combined with other knowledge provide support for recommendations for future software cost estimation research, including: 1) Increase the breadth of the search for relevant studies, 2) Search manually for relevant papers within a carefully selected set of journals when completeness is essential, 3) Conduct more studies on estimation methods commonly used by the software industry, and, 4) Increase the awareness of how properties of the data sets impact the results when evaluating estimation methods
Evaluating prediction systems in software project estimation
This is the Pre-print version of the Article - Copyright @ 2012 Elsevier
Context: Software engineering has a problem in that when we empirically evaluate competing prediction systems we obtain conflicting results.
Objective: To reduce the inconsistency amongst validation study results and provide a more formal foundation to interpret results with a particular focus on continuous prediction systems.
Method: A new framework is proposed for evaluating competing prediction systems based upon (1) an unbiased statistic, Standardised Accuracy, (2) testing the result likelihood relative to the baseline technique of random "predictions", that is guessing, and (3) calculation of effect sizes.
Results: Previously published empirical evaluations of prediction systems are re-examined and the original conclusions shown to be unsafe. Additionally, even the strongest results are shown to have no more than a medium effect size relative to random guessing.
Conclusions: Biased accuracy statistics such as MMRE are deprecated. By contrast this new empirical validation framework leads to meaningful results. Such steps will assist in performing future meta-analyses and in providing more robust and usable recommendations to practitioners.
Martin Shepperd was supported by the UK Engineering and Physical Sciences Research Council (EPSRC) under Grant EP/H050329
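The three parts of the framework can be sketched in a few lines. The abstract does not give formulas, so the definitions used here are assumptions based on how Standardised Accuracy is usually described: SA = 1 - MAR / MAR_P0, where MAR is the mean absolute residual of the prediction system and MAR_P0 is the mean MAR over many runs of random guessing (each case is "predicted" using the actual value of another randomly chosen case), with effect size (MAR - MAR_P0) / s_P0. All function names, the number of runs and the data are hypothetical.

```python
import random
import statistics


def mar(actuals, predictions):
    """Mean absolute residual of a set of predictions."""
    return statistics.fmean([abs(a - p) for a, p in zip(actuals, predictions)])


def random_guess_mar(actuals, rng):
    """MAR of one run of guessing: predict each case with the actual
    value of a different, randomly chosen case."""
    n = len(actuals)
    residuals = []
    for t in range(n):
        r = rng.randrange(n - 1)
        if r >= t:  # skip the target case itself
            r += 1
        residuals.append(abs(actuals[t] - actuals[r]))
    return statistics.fmean(residuals)


def standardised_accuracy(actuals, predictions, runs=1000, seed=0):
    """Return (SA, effect size) relative to random guessing (assumed definitions)."""
    rng = random.Random(seed)
    baseline = [random_guess_mar(actuals, rng) for _ in range(runs)]
    mar_p0 = statistics.fmean(baseline)
    s_p0 = statistics.stdev(baseline)
    mar_pi = mar(actuals, predictions)
    sa = 1.0 - mar_pi / mar_p0
    effect = (mar_pi - mar_p0) / s_p0  # negative means better than guessing
    return sa, effect


# Illustrative data only, not taken from any study.
actuals = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
predictions = [12.0, 18.0, 33.0, 39.0, 52.0, 57.0]
sa, effect = standardised_accuracy(actuals, predictions)
print(f"SA = {sa:.3f}, effect size = {effect:.2f}")
```

Under these definitions an SA near zero means the prediction system does no better than guessing, and a negative SA means it does worse. SA is sometimes reported as a percentage rather than a fraction.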
Data sets and data quality in software engineering
OBJECTIVE - to assess the extent and types of techniques used to manage quality within software engineering data sets. We consider this a particularly interesting question in the context of initiatives to promote sharing and secondary analysis of data sets.
METHOD - we perform a systematic review of available empirical software engineering studies.
RESULTS - only 23 out of the many hundreds of studies assessed explicitly considered data quality.
CONCLUSIONS - first, the community needs to consider the quality and appropriateness of the data set being utilised; not all data sets are equal. Second, we need more research into means of identifying, and ideally repairing, noisy cases. Third, it should become routine to use sensitivity analysis to assess conclusion stability with respect to the assumptions that must be made concerning noise levels
An evaluation of e-learning standards
The aim of this investigation is to perform an independent study of the various emerging elearning standards. This paper presents a summary of these standards in order to make them more accessible and understandable, and provides preliminary evidence as to their utility and adoption by the various UK higher and further education institutions. Recently there have been efforts to define standards for elearning content and elearning components, such as IEEE LOM, UK LOM, IMS, SCORM and OKI. Since it was not possible to cover all the standards in detail within the time available, our independent study focuses on eight standards. Although the results of the preliminary study suggest that the eight standards considered may help interoperability, accessibility and reusability of elearning content and elearning components, it is yet to be seen how many of these are actually followed at UK higher education institutions
Comparing software prediction techniques using simulation
The need for accurate software prediction systems increases as software becomes much larger and more complex. We believe that the underlying characteristics of the data set (size, number of features, type of distribution, etc.) influence the choice of the prediction system to be used. For this reason, we would like to control the characteristics of such data sets in order to systematically explore the relationship between accuracy, choice of prediction system, and data set characteristic. It would also be useful to have a large validation data set. Our solution is to simulate data, allowing both control and the possibility of a large number (1000) of validation cases. The authors compare four prediction techniques: regression, rule induction, nearest neighbor (a form of case-based reasoning), and neural nets. The results suggest that there are significant differences depending upon the characteristics of the data set. Consequently, researchers should consider prediction context when evaluating competing prediction systems. We observed that the more "messy" the data and the more complex the relationship with the dependent variable, the more variability in the results. In the more complex cases, we observed significantly different results depending upon the particular training set that has been sampled from the underlying data set. However, our most important result is that it is more fruitful to ask which is the best prediction system in a particular context rather than which is the "best" prediction system
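The idea of simulating data to compare techniques can be sketched as follows. This is a minimal illustration, not the study's design: the linear data-generating model, the noise level, the training set size of 50 and the use of mean absolute residual as the accuracy measure are all assumptions, and only two of the four techniques (regression and nearest neighbour) are shown. The 1000 validation cases follow the abstract.

```python
import random


def simulate(n, rng, noise_sd=1.0):
    """Draw n (x, y) cases from a hypothetical model y = 2x + Gaussian noise."""
    cases = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        cases.append((x, 2.0 * x + rng.gauss(0.0, noise_sd)))
    return cases


def fit_regression(train):
    """Ordinary least squares with a single predictor."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_nearest_neighbour(train):
    """Predict with the y value of the single closest training case."""
    return lambda x: min(train, key=lambda case: abs(case[0] - x))[1]


def mean_absolute_residual(model, cases):
    return sum(abs(y - model(x)) for x, y in cases) / len(cases)


rng = random.Random(42)
train = simulate(50, rng)
validation = simulate(1000, rng)  # large validation set, as in the abstract

results = {
    "regression": mean_absolute_residual(fit_regression(train), validation),
    "nearest neighbour": mean_absolute_residual(fit_nearest_neighbour(train), validation),
}
for name, score in results.items():
    print(f"{name}: {score:.3f}")
```

Because the data-generating process is under the experimenter's control, changing `simulate` (for example to a non-linear or heavily skewed model) and re-running shows how the ranking of techniques depends on data set characteristics, which is the question the abstract poses.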
Building on CHASM: A Study of Using Counts for the Analysis of Static Models of Processes
Process modelling is gaining increasing acceptance by software engineers as a useful discipline to facilitate both process understanding and improvement activities. This position paper builds upon previous work reported at the 1997 ICSE workshop on process models and empirical studies of software engineering (Phalp and Counsell 1997). In the previous paper, we argued that simple counts could be used to support analysis of static process models. We also illustrated the idea with a coupling measure for Role Activity Diagrams, a graphical process modelling notation adapted from Petri Nets. At that time only limited empirical work had been carried out, based upon a single industrial study, where we found high levels of coupling in an inefficient process (a more thorough description may be found in (Phalp and Shepperd 1999)). We now summarise a more recent study, which uses a similar analysis of process coupling again based on simple counts. In the study, we compared ten software prototyping processes drawn from eight different organisations. We found that this approach does yield insights into process problems, which could potentially be missed by qualitative analysis alone. This is particularly so when analysing real world processes, which are frequently more complex than their text book counterparts. One notable finding was that despite differences in size and domain, role types across the organisations exhibited similar levels of coupling. Furthermore, where there were deviations in one particular role type, this led the authors to discover a relationship between project size and the coupling levels within that type of role. Given the simplicity of our approach and the complexity of many real world processes we argue that quantitative analysis of process models should be considered as a process analysis technique